NSF PAR Search | NSF Public Access Repository

High-performance CPU Cloth Simulation Using Domain-decomposed Projective Dynamics

https://doi.org/10.1145/3731182

Lu, Zixuan; Liu, Ziheng; Lan, Lei; Wang, Huamin; Ishiwaka, Yuko; Jiang, Chenfanfu; Wu, Kui; Yang, Yin (August 2025, ACM Transactions on Graphics)

Whenever the concept of high-performance cloth simulation is brought up, GPU acceleration is almost always the first that comes to mind. Leveraging immense parallelization, GPU algorithms have demonstrated significant success recently, whereas CPU methods are somewhat overlooked. Indeed, the need for an efficient CPU simulator is evident and pressing. In many scenarios, high-end GPUs may be unavailable or are already allocated to other tasks, such as rendering and shading. A high-performance CPU alternative can greatly boost the overall system capability and user experience. Inspired by this demand, this paper proposes a CPU algorithm for high-resolution cloth simulation. By partitioning the garment model into multiple (but not massive) sub-meshes or domains, we assign per-domain computations to individual CPU processors. Borrowing the idea of projective dynamics that breaks the computation into global and local steps, our key contribution is a new parallelization paradigm at domains for both global and local steps so that domain-level calculations are sequential and lightweight. The CPU has much fewer processing units than a GPU. Our algorithm mitigates this disadvantage by wisely balancing the scale of the parallelization and convergence. We validate our method in a wide range of simulation problems involving high-resolution garment models. Performance-wise, our method is at least one order faster than existing CPU methods, and it delivers a similar performance compared with the state-of-the-art GPU algorithms in many examples, but without using a GPU.

Full Text Available

In parallel simulation, convergence and parallelism are often seen as inherently conflicting objectives. Improved parallelism typically entails lighter local computation and weaker coupling, which unavoidably slow the global convergence. This paper presents a novel GPU algorithm that achieves convergence rates comparable to fullspace Newton's method while maintaining good parallelizability just like the Jacobi method. Our approach is built on a key insight into the phenomenon ofovershoot.Overshoot occurs when a local solver aggressively minimizes its local energy without accounting for the global context, resulting in a local update that undermines global convergence. To address this, we derive a theoretically second-order optimal solution to mitigate overshoot. Furthermore, we adapt this solution into a pre-computable form. Leveraging Cubature sampling, our runtime cost is only marginally higher than the Jacobi method, yet our algorithm converges nearly quadratically as Newton's method. We also introduce a novel full-coordinate formulation for more efficient pre-computation. Our method integrates seamlessly with the incremental potential contact method and achieves second-order convergence for both stiff and soft materials. Experimental results demonstrate that our approach delivers high-quality simulations and outperforms state-of-the-art GPU methods with 50× to 100× better convergence.

Search for: All records